{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Manual CPA Attack\n",
    "\n",
    "Supported setups:\n",
    "\n",
    "SCOPES:\n",
    "\n",
    "* OPENADC\n",
    "* CWNANO\n",
    "\n",
    "PLATFORMS:\n",
    "\n",
    "* CWLITEARM\n",
    "* CWLITEXMEGA\n",
    "* CWNANO"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "SCOPETYPE = 'OPENADC'\n",
    "PLATFORM = 'CWLITEARM'\n",
    "CRYPTO_TARGET = 'TINYAES128C'\n",
    "num_traces = 50\n",
    "CHECK_CORR = False"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash -s \"$PLATFORM\" \"$CRYPTO_TARGET\"\n",
    "cd ../hardware/victims/firmware/simpleserial-aes\n",
    "make PLATFORM=$1 CRYPTO_TARGET=$2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The CPA Attack Theory"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a background on the CPA attack, please see the section [Correlation Power Analysis](https://wiki.newae.com/Correlation_Power_Analysis). It's also suggested that you complete Tutorial B1, since that will introduce Jupyter and show how to interface with ChipWhipserer using Python. It's assumed you've read that section and completed the tutorial and come back to this. Ok, you've done that? Good let's continue.\n",
    "\n",
    "Assuming you **actually** read that, it should be apparent that there is a few things we need to accomplish:\n",
    "\n",
    "1. Getting some power traces of our target while it's performing AES encryption.\n",
    "1. Reading the data, which consists of the analog waveform (trace) and input text sent to the encryption core\n",
    "1. Making the power leakage model, where it takes a known input text along with a guess of the key byte\n",
    "1. Implementing the correlation equation, and then looping through all the traces\n",
    "1. Ranking the output of the correlation equation to determine the most likely key.\n",
    "\n",
    "This tutorial will deal with both recording power traces using ChipWhisperer and breaking them using a CPA attack."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Capturing Power Traces"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Capturing power traces will be very similar to previous tutorials, except this time we'll be using a loop to capture multiple traces, as well as numpy to store them. It's not necessary, but we'll also plot the trace we get using `bokeh`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setup"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll use some helper scripts to make setup and programming easier. If you're using an XMEGA or STM (CWLITEARM) target, binaries with the correct should be setup for you:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%run \"Helper_Scripts/Setup_Generic.ipynb\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fw_path = '../hardware/victims/firmware/simpleserial-aes/simpleserial-aes-{}.hex'.format(PLATFORM)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cw.program_target(scope, prog, fw_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Capturing Traces"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below you can see the capture loop. The main body of the loop loads some new plaintext, arms the scope, sends the key and plaintext, then finally records and appends our new trace to the `traces[]` list. At the end, we convert the trace data to numpy arrays, since that's what we'll be using for analysis."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Capture Traces\n",
    "from tqdm import tnrange\n",
    "import numpy as np\n",
    "import time\n",
    "\n",
    "ktp = cw.ktp.Basic()\n",
    "\n",
    "traces = []\n",
    "\n",
    "for i in tnrange(num_traces, desc='Capturing traces'):\n",
    "    key, text = ktp.next()  # manual creation of a key, text pair can be substituted here\n",
    "    trace = cw.capture_trace(scope, target, text, key)\n",
    "    if trace is None:\n",
    "        continue\n",
    "    traces.append(trace)\n",
    "\n",
    "#Convert traces to numpy arrays\n",
    "trace_array = np.asarray([trace.wave for trace in traces])  # if you prefer to work with numpy array for number crunching\n",
    "textin_array = np.asarray([trace.textin for trace in traces])\n",
    "known_keys = np.asarray([trace.key for trace in traces])  # for fixed key, these keys are all the same"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we have our traces, we can also plot them using Bokeh:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from bokeh.plotting import figure, show\n",
    "from bokeh.io import output_notebook\n",
    "\n",
    "output_notebook()\n",
    "p = figure()\n",
    "\n",
    "xrange = range(len(traces[0].wave))\n",
    "p.line(xrange, traces[2].wave, line_color=\"red\")\n",
    "show(p)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# cleanup the connection to the target and scope\n",
    "scope.dis()\n",
    "target.dis()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Trace Analysis"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using the Trace Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we have some traces, let's look at what we've actually recorded. Looking at the earlier parts of the script, we can see that the trace data is in `trace_array`, while `textin_array` stores what we sent to our target to be encrypted. For now, let's get some basic information (the total number of traces, as well as the number of sample points in each trace) about the traces, since we'll need that later:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "numtraces = np.shape(trace_array)[0] #total number of traces\n",
    "numpoints = np.shape(trace_array)[1] #samples per trace"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the analysis, we'll need to loop over every byte in the key we want to attack, as well as every trace:\n",
    "```python\n",
    "for bnum in range(0, 16):\n",
    "    for tnum in range(0, numtraces):\n",
    "        pass\n",
    "```\n",
    "Though we didn't loop over them, note that each trace is made up of a bunch of sample points.\n",
    "Let's take a closer look at AES so that we can replace that `pass` with some actual code."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Calculating Correlation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we have some power traces of our target that we can use, we can move on to the next steps of our attack. Looking way back to how AES works, remember we are effectively attemping to target the position at the bottom of this figure:\n",
    "\n",
    "![title](https://wiki.newae.com/images/7/71/Sbox_cpa_detail.png)\n",
    "\n",
    "The objective is thus to determine the output of the S-Box, where the S-Box is defined as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sbox = (\n",
    "    0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5, 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,\n",
    "    0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0, 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0,\n",
    "    0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc, 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15,\n",
    "    0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a, 0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75,\n",
    "    0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0, 0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84,\n",
    "    0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b, 0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf,\n",
    "    0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85, 0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8,\n",
    "    0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5, 0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2,\n",
    "    0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17, 0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73,\n",
    "    0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88, 0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb,\n",
    "    0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c, 0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79,\n",
    "    0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9, 0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08,\n",
    "    0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6, 0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a,\n",
    "    0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e, 0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e,\n",
    "    0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94, 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf,\n",
    "    0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68, 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Thus we need to write a function taking a single byte of input, a single byte of the guessed key, and return the output of the S-Box:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def intermediate(pt, keyguess):\n",
    "    return sbox[pt ^ keyguess]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, remember we want the Hamming Weight of the guess. Our assumption is that the system is leaking the Hamming Weight of the output of that S-Box. As a dumb solution, we could first convert every number to binary and count the 1's:\n",
    "\n",
    "```python\n",
    ">>> bin(0x1F)\n",
    "'0b11111'\n",
    ">>> bin(0x1F).count('1')\n",
    "5\n",
    "```\n",
    "This will ultimately be fairly slow. Instead we make a lookup table using this idea:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "HW = [bin(n).count(\"1\") for n in range(0, 256)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Performing the Check"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Remember the objective is to calculate the following:\n",
    "$$r_{i,j} = \\frac{\\sum_{d=1}^{D}[(h_{d,i} - \\bar{h_i})(t_{d,j}-\\bar{t_j})]}{\\sqrt{\\sum_{d=1}^D(h_{d,i}-\\bar{h_i})^2\\sum_{d=1}^D(t_{d,j}-\\bar{t_j})^2}}$$\n",
    "\n",
    "Where\n",
    "\n",
    "| **Equation** | **Python Variable** | **Value**  | \n",
    "|--------------|---------------------|------------|\n",
    "|  d           |       tnum          | trace number |\n",
    "|  i           |       kguess        | subkey guess |\n",
    "| j | j index trace point | sample point in trace |\n",
    "| h | hypint | guess for power consumption | \n",
    "| t | traces | traces | \n",
    "\n",
    "It can be noticed there is effectively three sums, all sums are done over all traces. For this initial implementation we'll be explicitly calculating some of these sums, although it's faster to use NumPy to calculate across large arrays. We'll convert those three summations into variables, turning the equation into this format:\n",
    "\n",
    "$$r_{i,j}=\\frac{sumnum}{\\sqrt{snumden1 * sumden2}}$$\n",
    "\n",
    "Where:\n",
    "\n",
    "$$sumnum = \\sum_{d=1}^{D}[(h_{d,i} - \\bar{h_i})(t_{d,j}-\\bar{t_j})]$$\n",
    "\n",
    "$$sumden1 = \\sum_{d=1}^D(h_{d,i}-\\bar{h_i})^2$$\n",
    "\n",
    "$$sumden2 = \\sum_{d=1}^D(t_{d,j}-\\bar{t_j})^2$$\n",
    "\n",
    "Looking at this, we can see that we'll need $\\bar{h_i}$ and $\\bar{t_j}$, so let's start by building some code that will give us those. Looking at the CPA tutorial, we can see that $h_{d,i}$ is just our guess for the power consumption in trace $d$ for subkey $i$. We can get that easily using the `HW` array and `intermediate()` function we defined earlier:\n",
    "\n",
    "```python\n",
    "for bnum in range(0, 16):\n",
    "    cpaoutput = [0]*256\n",
    "    maxcpa = [0]*256\n",
    "    for kguess in range(0, 256):\n",
    "        hyp = np.zeros(numtraces)\n",
    "        for tnum in range(0, numtraces):\n",
    "            hyp[tnum] = HW[intermediate(pt[tnum][bnum], kguess)]\n",
    "```\n",
    "\n",
    "Now we can get $\\bar{h_i}$:\n",
    "```python\n",
    "meanh = np.mean(hyp, dtype=np.float64)\n",
    "```\n",
    "\n",
    "and $\\bar{t_j}$ is just the mean of all of our traces:\n",
    "```python\n",
    "meant = np.mean(traces, axis=0, dtype=np.float64)\n",
    "```\n",
    "\n",
    "Next, let's move on to calculating the whole sums using $h_{d,i}$ and $t_{d,j}$ and the values we just calculated:\n",
    "\n",
    "```python\n",
    "#For each trace, do the following\n",
    "for tnum in range(numtraces):\n",
    "    hdiff = (hyp[tnum] - meanh)\n",
    "    tdiff = traces[tnum,:] - meant\n",
    "\n",
    "    sumnum = sumnum + (hdiff*tdiff)\n",
    "    sumden1 = sumden1 + hdiff*hdiff \n",
    "    sumden2 = sumden2 + tdiff*tdiff\n",
    "```\n",
    "\n",
    "We can now get the correlation for each of our subkey guesses, which we'll call `cpaoutput[]`:\n",
    "\n",
    "```python\n",
    "cpaoutput[kguess] = sumnum / np.sqrt( sumden1 * sumden2 )\n",
    "```\n",
    "\n",
    "We're almost done! All that's left is to use that correlation to figure out which subkey best matches our power traces. First off, we only care about absolute value of the correlation (that there is a linear correlation), not sign. Additionally, though this didn't factor into our correlation calculation, remember that each trace was actually made up of a bunch of sample points. This means that what we actually have is the correlation of each subkey guess to each sample point. Typically only a few points in the trace are correlating, and it's the maximum across the entire trace we are concerned with, so we can pick our correlation for each subkey by:\n",
    "\n",
    "```python\n",
    "maxcpa[kguess] = max(abs(cpaoutput[kguess]))\n",
    "```\n",
    "\n",
    "Finally, we can find the subkey that best matches our data by finding the one with the biggest correlation:\n",
    "\n",
    "```python\n",
    "bestguess[bnum] = np.argmax(maxcpa)\n",
    "```\n",
    "\n",
    "Putting it all together:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Finished Script"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from tqdm import tqdm\n",
    "\n",
    "sbox = (\n",
    "    0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5, 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,\n",
    "    0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0, 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0,\n",
    "    0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc, 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15,\n",
    "    0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a, 0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75,\n",
    "    0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0, 0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84,\n",
    "    0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b, 0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf,\n",
    "    0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85, 0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8,\n",
    "    0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5, 0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2,\n",
    "    0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17, 0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73,\n",
    "    0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88, 0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb,\n",
    "    0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c, 0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79,\n",
    "    0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9, 0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08,\n",
    "    0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6, 0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a,\n",
    "    0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e, 0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e,\n",
    "    0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94, 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf,\n",
    "    0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68, 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16)\n",
    "\n",
    "def intermediate(pt, keyguess):\n",
    "    return sbox[pt ^ keyguess]\n",
    "\n",
    "HW = [bin(n).count(\"1\") for n in range(0, 256)]\n",
    "\n",
    "numtraces = np.shape(trace_array)[0] #total number of traces\n",
    "numpoint = np.shape(trace_array)[1] #samples per trace\n",
    "\n",
    "pt = textin_array\n",
    "knownkey = traces[0].key\n",
    "cparefs = [0] * 16\n",
    "bestguess = [0]*16\n",
    "\n",
    "for bnum in tqdm(range(0, 16), desc='Attacking subkeys'):\n",
    "    cpaoutput = [0] * 256\n",
    "    maxcpa = [0] * 256\n",
    "    for kguess in range(0, 256):\n",
    "\n",
    "        # Initialize arrays &amp; variables to zero\n",
    "        sumnum = np.zeros(numpoint)\n",
    "        sumden1 = np.zeros(numpoint)\n",
    "        sumden2 = np.zeros(numpoint)\n",
    "\n",
    "        hyp = np.zeros(numtraces)\n",
    "        for tnum in range(0, numtraces):\n",
    "            hyp[tnum] = HW[intermediate(pt[tnum][bnum], kguess)]\n",
    "\n",
    "        # Mean of hypothesis\n",
    "        meanh = np.mean(hyp, dtype=np.float64)\n",
    "\n",
    "        # Mean of all points in trace\n",
    "        meant = np.mean(trace_array, axis=0, dtype=np.float64)\n",
    "\n",
    "        # For each trace, do the following\n",
    "        for tnum in range(0, numtraces):\n",
    "            hdiff = (hyp[tnum] - meanh)\n",
    "            tdiff = trace_array[tnum, :] - meant\n",
    "\n",
    "            sumnum = sumnum + (hdiff * tdiff)\n",
    "            sumden1 = sumden1 + hdiff * hdiff\n",
    "            sumden2 = sumden2 + tdiff * tdiff\n",
    "\n",
    "        cpaoutput[kguess] = sumnum / np.sqrt(sumden1 * sumden2)\n",
    "        maxcpa[kguess] = max(abs(cpaoutput[kguess]))\n",
    "\n",
    "    bestguess[bnum] = np.argmax(maxcpa)\n",
    "    cparefs[bnum] = np.argsort(maxcpa)[::-1]\n",
    "\n",
    "print(\"Best Key Guess: \", end=\"\")\n",
    "for b in bestguess: print(\"%02x \" % b, end=\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Checking Our Guess"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since we were the ones who decided the key (and kept track of it in knownkey), we can check if our guess was right:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for b in knownkey: print(\"%02x \"%b, end=\"\")\n",
    "print(\"\\n\")\n",
    "if all([known_byte == guess_byte for known_byte, guess_byte in zip(knownkey, bestguess)]):\n",
    "    print(\"Guess was right\")\n",
    "else:\n",
    "    print(\"Guess was wrong\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can do even better by calculating the Partial Guessing Entropy.\n",
    "\n",
    "The Partial Guessing Entropy (PGE) is a useful metric of where the correct answer is ranked. This requires us to know the actual encryption key used during operation, which we do know! \n",
    "\n",
    "Certain attacks will use different keys during the acqusition period, meaning the a list of all the known keys is required since there isn't a constant key. In our case, we already have access in `knownkey`.\n",
    "\n",
    "Previously, we simply printed the maximum output for each subkey as follows:\n",
    "\n",
    "```python\n",
    "#Find maximum value of key\n",
    "bestguess[bnum] = np.argmax(maxcpa)\n",
    "```\n",
    "\n",
    "To sort the list of CPA output's, we'll use the `argsort()` function from NumPy. This will output a list where the first element is the index of the lowest value, next element is the index of the next-highest element, etc. Because in our input list the `maxcpa` vector's indexes correspond to the key guess, this allows us to know where the keys are. We reverse that sorted list to put the first element as the maximum CPA output:\n",
    "\n",
    "```python\n",
    "cparefs = np.argsort(maxcpa)[::-1]\n",
    "```\n",
    "\n",
    "Finally, the Partial Guessing Entropy is simply the location of the known correct key byte inside that array. We can find that with the `.index()` function:\n",
    "\n",
    "```python\n",
    "print cparefs.index(0x2B)\n",
    "```\n",
    "\n",
    "Where the correct key should of course come from our `knownkey` variable instead of being hard-coded. Pulling it all together:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pge = [0]*16\n",
    "for bnum in range(0, 16):\n",
    "    #Find maximum value of key\n",
    "    #Find PGE\n",
    "    pge[bnum] = list(cparefs[bnum]).index(knownkey[bnum])\n",
    "    \n",
    "print(\"PGE: \", end=\"\")\n",
    "for i in pge:\n",
    "    print(i, end=\"\")\n",
    "print(\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Improvements"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The implementation of the correlation function runs as a loop over all traces. Ideally we'd like to implement this as a 'online' calculation; that is, we can add a trace in, observe the output, add another trace in, observe the output, etc. When generating plots of the Partial Guessing Entropy (PGE) vs. number of traces this is greatly preferred, since otherwise we need to run the loop many times!\n",
    "\n",
    "We can use an alternate form of the correlation equation, which explicitly stores sums of the variables. This is easier to perform online calculation with, since when adding a new trace it's simple to update these sums. This form of the equation looks like:\n",
    "\n",
    "$$r_{i,j} = \\frac{D\\sum_{d=1}^{D}h_{d,i}t_{d,j}-\\sum_{d=1}^{D}h_{d,i}\\sum_{d=1}^{D}t_{d,j}}{\\sqrt{((\\sum_{d=1}^Dh_{d,i})^2-D\\sum_{d=1}^Dh_{d,i}^2)-((\\sum_{d=1}^Dt_{d,j})^2-D\\sum_{d=1}^Dh_{d,j}^2)}}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Tests"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert all([known_byte == guess_byte for known_byte, guess_byte in zip(knownkey, bestguess)]), \"Failed to break encryption key\\nGot: {}\\nExpected: {}\\n\".format(knownkey, bestguess)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert not (np.all(pge)), \"PGE not zero for byte: {}\".format(pge)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
